A Low-Rank Approximation for MDPs via Moment Coupling

Authors

Abstract

Markov Decision Process Tayloring for Approximation Design

Optimal control problems are difficult to solve on large state spaces, calling for the development of approximate solution methods. In "A Low-Rank Approximation for MDPs via Moment Coupling," Zhang and Gurvich introduce a novel framework for approximating Markov decision processes (MDPs) that stands on two pillars: (i) state aggregation, as the algorithmic infrastructure, and (ii) central-limit-theorem-type approximations, as the mathematical underpinning. The theoretical guarantees are grounded in an approximation of the Bellman equation by a partial differential equation (PDE) in which, in the spirit of the central limit theorem, the transition matrix of the controlled chain is reduced to its local first and second moments. Instead of solving the PDE, the algorithm introduced in the paper constructs a "sister" (controlled) chain whose local transition moments are approximately identical to those of the focal chain. Because of this moment matching, the original and sister chains are coupled through the PDE, facilitating optimality guarantees. Embedded into standard soft aggregation, moment matching provides a disciplined mechanism to tune the aggregation and disaggregation probabilities.
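
To make the moment-reduction idea concrete, the following sketch uses notation assumed here rather than taken from the paper (value function V, controlled kernel P_a, one-step increment \xi_a, local moments \mu_a and \Sigma_a, one-stage cost c, discount \gamma); it shows, schematically, how a second-order expansion turns the Bellman operator into a PDE that depends on the chain only through its local first and second moments.

\[
  (P_a V)(x) \;=\; \mathbb{E}\bigl[ V(x + \xi_a(x)) \bigr]
  \;\approx\; V(x) + \mu_a(x)^{\top} \nabla V(x)
  + \tfrac{1}{2}\operatorname{tr}\!\bigl( \Sigma_a(x)\, \nabla^2 V(x) \bigr),
\]
\[
  V(x) \;=\; \min_a \bigl\{ c(x,a) + \gamma\,(P_a V)(x) \bigr\}
  \;\;\leadsto\;\;
  V(x) \;\approx\; \min_a \Bigl\{ c(x,a) + \gamma \Bigl[ V(x) + \mu_a(x)^{\top} \nabla V(x)
  + \tfrac{1}{2}\operatorname{tr}\bigl( \Sigma_a(x)\, \nabla^2 V(x) \bigr) \Bigr] \Bigr\}.
\]

Under this reading, any "sister" chain whose local moments approximately match (\mu_a, \Sigma_a) is governed by approximately the same PDE, which is what allows the original and sister chains to be coupled.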


Similar Articles

Restricted Low-Rank Approximation via ADMM

The matrix low-rank approximation problem with additional convex constraints arises in many applications and has been studied extensively. However, the problem is nonconvex and NP-hard, and most existing solutions are heuristic and application-dependent. In this paper, we show that, beyond the many applications already in the literature, this problem can be used to recover a...


Value function approximation via low-rank models

We propose a novel value function approximation technique for Markov decision processes. We consider the problem of compactly representing the state-action value function using a low-rank and sparse matrix model. The problem is to decompose a matrix that encodes the true value function into low-rank and sparse components, and we achieve this using Robust Principal Component Analysis (PCA). Unde...
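
As a rough illustration of the low-rank-plus-sparse idea (a minimal NumPy sketch, not the authors' method or code; the function rpca_decompose, its parameter choices, and the toy data are all assumptions for illustration), a state-action value table can be split into a low-rank component and a sparse residual with a standard Robust-PCA-style iteration:

import numpy as np

def rpca_decompose(Q, lam=None, n_iter=200, tol=1e-7):
    """Split Q into low-rank L plus sparse S (illustrative principal component pursuit)."""
    m, n = Q.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))             # common sparsity-weight choice
    mu = 0.25 * m * n / (np.abs(Q).sum() + 1e-12)  # step-size heuristic
    L = np.zeros_like(Q); S = np.zeros_like(Q); Y = np.zeros_like(Q)
    for _ in range(n_iter):
        # low-rank update: singular-value thresholding of Q - S + Y/mu
        U, sig, Vt = np.linalg.svd(Q - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(sig - 1.0 / mu, 0.0)) @ Vt
        # sparse update: entrywise soft-thresholding of the remaining residual
        R = Q - L + Y / mu
        S = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0.0)
        # dual ascent on the equality constraint Q = L + S
        Y += mu * (Q - L - S)
        if np.linalg.norm(Q - L - S) <= tol * np.linalg.norm(Q):
            break
    return L, S

# toy usage: a noisy rank-2 "value table" over 50 states and 8 actions
rng = np.random.default_rng(0)
Q = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 8))
Q[rng.random(Q.shape) < 0.05] += 5.0               # a few sparse outliers
L, S = rpca_decompose(Q)
print(np.linalg.matrix_rank(np.round(L, 6)), int((np.abs(S) > 1e-3).sum()))

The singular-value-thresholding step is what enforces the low-rank structure; the soft-thresholding step absorbs the few large, sparse deviations.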


Approximation Algorithms for l0-Low Rank Approximation

For any column A_{:,i}, the best response vector is 1, so ‖A_{:,i} 1^T − A‖_0 = 2(n − 1) = 2(1 − 1/n) OPT_F (with OPT_F = n for Boolean l0-rank-1). Theorem 3 (Sublinear). Given A ∈ {0,1}^{m×n} with column adjacency arrays and with row and column sums, we can compute w.h.p. in time O(min{‖A‖_0 + m + n, ψ_B^{-1}(m + n) log(mn)}) vectors u, v such that ‖A − uv^T‖_0 ≤ (1 + O(ψ_B)) OPT_B. Theorem 4 (Exact). Given A ∈ {0,1}^{m×n} with OPT_B / ‖A‖_0...


Approximation Algorithms for $\ell_0$-Low Rank Approximation

We study the l0-Low Rank Approximation Problem, where the goal is, given an m×n matrix A, to output a rank-k matrix A′ for which ‖A′ − A‖_0 is minimized. Here, for a matrix B, ‖B‖_0 denotes the number of its non-zero entries. This NP-hard variant of low rank approximation is natural for problems with no underlying metric, and its goal is to minimize the number of disagreeing data positions. We provid...
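
To make the objective concrete (a toy sketch under assumed notation, not the algorithms behind the theorems quoted above; boolean_rank1, its initialization, and the best-response rule are illustrative assumptions), the l0 cost counts disagreeing entries, and an alternating best-response heuristic for the Boolean rank-1 case can be written as:

import numpy as np

def l0_cost(A, u, v):
    # number of entries where A and the rank-1 matrix u v^T disagree
    return int(np.sum(A != np.outer(u, v)))

def boolean_rank1(A, n_rounds=10):
    v = A[np.argmax(A.sum(axis=1))].copy()        # start from the densest row
    for _ in range(n_rounds):
        # best response for u given v: u_i = 1 exactly when that strictly
        # lowers the disagreement count in row i (ties resolve to 0)
        u = (A @ v > v.sum() / 2).astype(A.dtype)
        # symmetric best response for v given u
        v = (A.T @ u > u.sum() / 2).astype(A.dtype)
    return u, v

A = np.random.default_rng(1).integers(0, 2, size=(6, 5))
u, v = boolean_rank1(A)
print(l0_cost(A, u, v), "disagreements out of", A.size)

Each step is an exact best response for one factor given the other, so the disagreement count never increases across rounds.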


Low-rank Tensor Approximation

Approximating a tensor by another of lower rank is in general an ill-posed problem. Yet this kind of approximation is mandatory in the presence of measurement errors or noise. We show how tools recently developed in compressed sensing can be used to solve this problem. More precisely, a minimal angle between the columns of the loading matrices restores both existence and uniqueness of the...



Journal

Title: Operations Research

Year: 2022

ISSN: 1526-5463, 0030-364X

DOI: https://doi.org/10.1287/opre.2022.2392